Skip to content

fix(gateway): try harder to detect Podman#1536

Merged
krishicks merged 1 commit into
mainfrom
hicks/podman-autodetect
Jun 3, 2026
Merged

fix(gateway): try harder to detect Podman#1536
krishicks merged 1 commit into
mainfrom
hicks/podman-autodetect

Conversation

@krishicks
Copy link
Copy Markdown
Collaborator

@krishicks krishicks commented May 22, 2026

Summary

Auto-detection previously treated Podman as available only when the podman CLI was visible on PATH. However, package manager services can run with a restricted PATH, which lets Docker be selected even when a Podman API socket is reachable. Additionally, podman may symlink /var/run/docker.sock to podman's machine unix socket, which would be incorrectly detected as Docker. Worse still: the podman machine may not even be running.

This replaces the Podman binary check with a functional HTTP probe against the standard Podman socket paths. The probe requires /_ping to answer with a Libpod-Api-Version header before treating the socket as Podman, which lets the gateway select the embedded Podman driver only when the API is usable.

Related Issue

Changes

  • Detect Podman when either the podman CLI is available or a standard Podman API socket is reachable.
  • Add Podman socket candidate probing for OPENSHELL_PODMAN_SOCKET, XDG_RUNTIME_DIR, Linux /run/user/{uid}, and macOS Podman machine socket paths.
  • Add/update unit tests for Podman socket candidate selection and reachable Unix socket detection.

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (if applicable)

Additional validation:

  • Built and ran current and previous gateway binaries with a restricted PATH; current commit selected Podman, previous commit selected Docker.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 22, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

@krishicks
Copy link
Copy Markdown
Collaborator Author

I have read the DCO document and I hereby sign the DCO.

@krishicks
Copy link
Copy Markdown
Collaborator Author

recheck

@krishicks krishicks force-pushed the hicks/podman-autodetect branch 2 times, most recently from b281fe7 to 6e7e986 Compare May 22, 2026 23:22
@drew
Copy link
Copy Markdown
Collaborator

drew commented May 23, 2026

/ok to test 6e7e986

@krishicks krishicks force-pushed the hicks/podman-autodetect branch 2 times, most recently from 7ae95dc to 9abd176 Compare May 25, 2026 17:17
@pimlock
Copy link
Copy Markdown
Collaborator

pimlock commented May 25, 2026

/ok to test 9abd176

@johntmyers johntmyers added the gator:in-review Gator is reviewing or awaiting PR review feedback label Jun 2, 2026
@johntmyers
Copy link
Copy Markdown
Collaborator

gator-agent

PR Review Status

Validation: project-valid. The PR author is in @NVIDIA/openshell-maintainers, the PR is not draft, DCO is passing, /ok to test has already been posted for the current head SHA, and duplicate searches did not find overlapping open or closed issues/PRs.
Head SHA: 9abd176bcdd777b60fe2c7bfef5456917affbc10

Review findings:

  • Warning: crates/openshell-core/src/config.rs appears to still miss one case called out in the PR description. The new Podman detection probes Podman-specific socket candidates, but Docker detection can still treat Docker-compatible socket paths such as /var/run/docker.sock, $HOME/.docker/run/docker.sock, $XDG_RUNTIME_DIR/docker.sock, or DOCKER_HOST=unix://... as Docker based on socket existence. If /var/run/docker.sock is a symlink to a Podman machine socket, Podman probing can return false, Docker probing can return true, and auto-detect can still select the wrong driver.
  • Suggested fix: include Docker-compatible socket candidates in the Podman Libpod probe before falling through to Docker detection, and add a unit test for a Libpod server reachable through a docker.sock-named path or symlink.

E2E decision: this affects Podman/Docker driver auto-detection. After the review finding is addressed, test:e2e should be applied before moving to pipeline watch.

Next state: gator:in-review.

@krishicks
Copy link
Copy Markdown
Collaborator Author

It's true that the Docker driver can fail if Docker is selected and /var/run/docker.sock points at a Podman machine socket. That was also true before this change: the Docker driver uses Bollard's local defaults, which resolve to DOCKER_HOST or /var/run/docker.sock, and this PR does not change Docker endpoint resolution.

This change is scoped to Podman autodetection. Podman autodetection probes the same Podman-specific socket paths that the Podman driver will use by default, plus OPENSHELL_PODMAN_SOCKET when explicitly set. It intentionally does not treat Docker-compatible socket paths like /var/run/docker.sock as Podman discovery inputs.

I think we should fix the Docker autodetect and socket selection in a separate change; Bollard's defaults aren't sufficient. And in that change we should do the same as this change does for Podman, keeping the auto-detection's candidate sockets and the client's actual socket selection aligned.

@johntmyers
Copy link
Copy Markdown
Collaborator

gator-agent

PR Review Status

Validation: project-valid. The PR remains scoped to Podman autodetection and the author clarified that Docker-compatible socket selection should be handled in a separate Docker autodetect/socket-selection change.
Head SHA: 9abd176bcdd777b60fe2c7bfef5456917affbc10

Review findings:

  • No blocking findings remain. The earlier Docker-compatible socket concern is not treated as a blocker for this Podman-scoped PR after the author's clarification.

E2E decision: this affects Podman/Docker driver auto-detection, so test:e2e is being applied before pipeline watch.

Next state: gator:watch-pipeline.

@johntmyers johntmyers added gator:watch-pipeline Gator is monitoring PR CI/CD status test:e2e Requires end-to-end coverage and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 2, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Label test:e2e applied for 9abd176. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute the standard E2E suite after building the required gateway and supervisor images once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

Auto-detection previously treated Podman as available only when the podman CLI
was visible on PATH. However, package manager services can run with a
restricted PATH, which lets Docker be selected even when a Podman API socket is
reachable. Additionally, podman may symlink /var/run/docker.sock to podman's
machine unix socket, which would be incorrectly detected as Docker. Worse
still: the podman machine may not even be running.

This replaces the Podman binary check with a functional HTTP probe against the
standard Podman socket paths. The probe requires /_ping to answer with a
Libpod-Api-Version header before treating the socket as Podman, which lets the
gateway select the embedded Podman driver only when the API is usable.

Signed-off-by: Kris Hicks <khicks@nvidia.com>
@krishicks krishicks force-pushed the hicks/podman-autodetect branch from 9abd176 to 033aad8 Compare June 3, 2026 01:19
@krishicks krishicks merged commit 1f07bf0 into main Jun 3, 2026
61 of 63 checks passed
@krishicks krishicks deleted the hicks/podman-autodetect branch June 3, 2026 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gator:watch-pipeline Gator is monitoring PR CI/CD status test:e2e Requires end-to-end coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants